.<<last label: P1>>
.comment
.turn on "#"
.GROUP SKIP 12
.BEGIN CENTER
.LECTURES ON LISP
.by
.John Allen
.
.copyright %fc%1 1977 by John Allen
.end
.end of comment;
.ss(Introduction)
.select 1;
This issue of BYTE magazine intends to begin a dialog between the Artificial
Intelligence community and the small computer user. There are several
reasons for expecting this to be a fruitful venture. The most obvious is that
in the AI community %2all%1 machines are small! Both discussants in this
dialog expect their "small" machines to perform
at the limits of the hardware. Both groups frequently use their
machines in an exploratory fashion, attempting to capture a partially
understood phenomenon in an algorithmic form. This exploratory style
is supported in both groups by the interactive nature of personal computation;
neither AI machines nor personal computers cater to a batch-processing style of
computation. Given these striking similarities in computing
and the interest of the BYTE readership in things AI, it follows
that the languages of AI should be of interest and value
to the personal computation audience. The basis for modern AI languages
is LISP.
Within this issue are [⊗⊗⊗] papers exploring various aspects of LISP: the
language, its applications, and its implementations. Some topics involve
reality --the implementation of LISP-like languages for present day microcomputers;
others are "mind-stretchers" --the applications of LISP-like languages for the
specification of AI problems.
This paper introduces the language and
develops a perspective
from which to view the detailed articles.
LISP is simple and difficult, elegant and ad#hoc; it is a beautiful blend
of foresight and fortuity. LISP is a programming language, typically characterized
as a "special purpose list-processing language". But LISP is no more a special
purpose programming language than mathematics is a "special purpose
language for floating-point computations". LISP is also a
notation for describing general computations.
Just as there's more to
mathematics than the accounting and bookkeeping properties presented
in Algol-like languages, there's much more to LISP
than "just another programming language".
The best description of the LISP programming language is that %3it is a
high level machine language%1. That is, it shares many of the
facets of contemporary machine language --the necessity for attention to
detail, the freedom to manipulate the machine's data and programs without
restriction--; yet LISP is "high level" in that the language contains
the expressive power and convenience of traditional high level languages.
The "contradiction" is resolvable: a LISP machine is just a higher level
machine whose data items are organized differently than the
binary bit patterns of most machines, and the LISP programming
language is the "assembly language" for this machine.
.Ss(LISP data structures)
Before introducing the constructs of the language, we must
discuss the data items of the language.
In a traditional language we would find numeric constants like
1, and 43.
In LISP our analogous
constants are called %2atoms%1. An atom is either a numeral
or an identifier; that is, a string of upper case alpha-numeric
characters subject to the proviso
that the first character in the string is an alphabetic character.
These non-numeric atoms will be called %2literal atoms%1 or %2identifiers%1.
For example, %3ABC123, 12, %1and %3NIL%1 are atoms, but %31A2%1 and %3(A#B)%1
are not.
LISP also has complex constants called %2lists%1. Lists are built out of
atoms and other lists.
.PT24
.FP
%21.%1 Any atom or list can be an element of a list.
.FP
%22.%1 Given any collection e%41%1,#...#,e%4n%1 of list elements,
then %3(%1#e%41%1#...#e%4n%3 )%1 is also a list.
.pt24
So %3(A B)%1 is a list; as is %3(A B C)%1, and %3(A 1 (ABC 23)) %1.
.PT24
.fp
The last example is a list of three elements;
its third element is also a list#--#of two elements: the atom %3ABC%1
and the numeral %323%1.
Atoms and lists are the basic LISP data structures which will concern
us through most of this discussion. A robust production version of LISP
includes many more data objects, including arrays, arbitrary precision numbers,
and strings. Later within this article we will introduce general record structures
and functions as data objects; both of these items occur within the
LISP implementation and are made available to the user.
All data objects are "first class objects", available without restriction,
as values of variables.
.ss(LISP operations: %3first%1)
We need some
operations on these data structures.
Just as we should have a %3SUB%1traction operation in arithmetic
machines to decompose numbers, we have LISP instructions to decompose
lists.
One decomposition operation is called %3first%1; it
extracts the first element of a list. For example:
.BEGIN CENTERIT;
←%3first[%3(A B C)%*] %1gives %3A%1
.pt24
.END
This example is written in LISP's "external syntax",
called "meta LISP"; it is an instance of prefix notation.
The programming language --the
"internal notation"-- is called "S-expression LISP" or S-LISP.
Initially,
we will present algorithms in the M-expression
syntax since it is close to the more traditional programming notation;
however, since S-LISP is the language of our machine
we will insist on developing facility with that notation.
In a traditional architecture, both instructions and data are stored
in memory as patterns of bits. The processor usually has complete
freedom to manipulate those bits either as data items or as instructions.
An object accessed by the instruction counter is interpreted as an instruction;
other accesses to items usually imply a data interpretation.
One goal is the representation of LISP instructions as data items in
the LISP machine such that the processing unit of the
LISP machine will have equal
flexibility in interpreting the encoded information. An object
may sometimes play the role of program, and sometimes of data.
To represent program as data
we must specify a translation of each M-LISP instruction
into a list representation.
.BEGIN CENTER;
%2External Notation%1
%1<operation>%3[%1 <operand>%41%3;%1 ...%3;%1 <operand>%4n%3 ]%1
.END
.BEGIN CENTER;
%2List Notation%1
%3( %1<operation>%8T%1 <operand>%41%8T%1 ... <operand>%4n%8T%3 )%1
.END
.PT24;FP
where the %8T%1 means perform the translation process
recursively.
For this translation to be meaningful we must also describe
how the process is to terminate.
.pt24
.BEGIN INDENT 0,2,2;
%21.%1 An "operation" in prefix notation is something like "%3first%1"
or "+", whereas an "operation%8T%1" must be an atom
or a list.
We translate the operation name to
an appropriate atom: %3first%1 translates to
%3FIRST%1, + to %3PLUS%1, and * to %3TIMES%1.
.PT24;INDENT 0,2,2;
%22.%1 The other components which have appeared in M-LISP expressions are
constants
--atoms and lists; we translate a constant %7a%1 to the construct
%3(QUOTE#%7a%3)%1. For example, we represent the constant %3(A#B)%1 as
%3(QUOTE#(A#B))%1. This solution is similar to the "quoting" convention
of natural language: Cleveland is a city, but "Cleveland" is a nine-letter
word. The %3QUOTE%1 operator is more than simple pedantry; it
will play a critical role in fetch operation of the LISP machine.
.END
.PT24
To summarize,
our instruction format will be a list consisting of a representation of
the operation, followed by the
representations of the operands.
Those operands may themselves specify operations, but may specify
constant operands by using the %3QUOTE%1 operation. For example,
we represent
.FP
%3first[(A#B#C)]%1 as %3(FIRST (QUOTE (A B C))) %1and
.pt24
.FP
%3(FIRST (FIRST (QUOTE ((A B) C))))%1 represents %3first[first[((A B) C)]]%1.
.PT24
Values are obtained on a LISP machine much like one obtains values from
a pocket calculator.
We type in an S-LISP
expression, and the calculator displays the result. The evaluation of the
input expression can be quite involved; if an operand specifies a further
operation, then the current instruction must be suspended while that subsidiary
computation is performed.
So, evaluating ###%3(FIRST (FIRST (QUOTE ((A B) C))))%1### would involve the following:
.PT24
The leftmost %3FIRST%1 must wait since its operand requires evaluation;
similarly the next %3FIRST%1 must wait to take care of
its argument. But its argument is a %3QUOTE%1-ed expression.
#%3QUOTE%1 is kind: it requires no further computation,
and always returns its argument as value.
Here it returns the list %3((A#B)#C)%1. The inner
%3FIRST%1 completes now, returning %3(A#B)%1 to the outermost
%3FIRST%1; it is nudged into activity and finally returns %3A%1.
.pt24
Consider###%3(FIRST (QUOTE (FIRST (QUOTE (A B)))))%1. Notice that
the embedded expression %3(FIRST#(QUOTE#(A#B)))%1 has the appearance of a
LISP instruction.
However, that expression is surrounded by %3(QUOTE#...#)%1;
therefore it is simply a list; i.e., a constant. The final result of the
evaluation will be the atom %3FIRST%1 (since the computation
encodes the M-expression %3first[(FIRST (QUOTE (A B)))]%1).
Since %3QUOTE%1-d expressions appear so frequently,
we introduce an abbreviation: we write ####%3(QUOTE#%7a%3)%1 as %9'%7a%1.
.PT24
.FP
%1So the previous example ###%3(FIRST (QUOTE (FIRST (QUOTE (A B)))))%1
could be expressed as: ###%3(FIRST %9'%3(FIRST (QUOTE (A B))))%1;
or as ###%3(FIRST %9'%3(FIRST %9'%3(A B)))%1.
.ss(LISP operations: %3rest%1)
We also have an "instruction" named %3REST%1;
you may either think of the
instruction as a machine operation or as the translation of
an M-LISP expression. %3REST%1,
like %3FIRST%1, expects a list as its
argument. However, %3REST%1 returns a value
representing the list with the first element removed.
.BEGIN CENTERIT;group;
←%3(REST %9'%3(A B C)) %1yields %3(B C)
←%3(REST %9'%3(B C)) %1yields %3(C)
.apart;
.END
%1What about: %3(REST %9'%3(C)) %1?
When we remove the last element from a list
we get the empty list; its representation in LISP is "%3(#)%1".
The operations %3first%1 and %3rest%1 are called %2selector%1
functions since they are used to select components from a composite
data object.
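.PT24
.FP
Since these selectors nest, repeated %3REST%1s walk down a list and a final
%3FIRST%1 extracts an interior element. For example:
.BEGIN CENTERIT;group;
←%3(REST (REST %9'%3(A B C))) %1yields %3(C)
←%3(FIRST (REST (REST %9'%3(A B C)))) %1yields %3C%1
.END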
.ss(%3list%1)
Besides decomposing objects, we must be able to build new objects.
The general name for an operation which builds new objects is a %2constructor%1.
One LISP constructor is %3LIST%1; here are some examples of its usage.
.PT24
.BEGIN INDENT 0,5,5;
%21.%1 %3(LIST %9'%3A %9'%3B %9'%3C)%1 yields %3(A B C)%1.
.group;
.indent 0,5
%22.%1 %3(LIST 2 %9'%3B)%1 yields %3(2 B)%1. Note
that we didn't "%3QUOTE%1" the %32%1; the
machine understands that numbers are constants.
Second, the %3LIST%1 operation will take an arbitrary number of operands;
three in our first example, two in this one, and none in the next.
.indent 0,5,5
%23.%1 %3(LIST )%1 yields %3( )%1.
.END
.pt24
As with the other instructions (except %3QUOTE%1), %3LIST%1 can
handle instructions as operands.
.PT24
.FP
Try: %3(LIST (FIRST (QUOTE (A))) (REST (QUOTE (A B))) (QUOTE C))%1.
.PT24
Diligence may have been rewarded and you may have responded:
"%3(A#(B)#C)%1". There's an equal probability that you got mired in
parenthesis-counting and responded "(?!$≡)". One solution is to resort to
M-LISP and recast the expression as: %3list[first[(A)];rest[(A B)];C]%1.
However, we are attempting to develop our S-LISP expertise, so we should
find some way to make S-LISP expressions more readable.
We might
also use our abbreviation:##%3(LIST (FIRST %9'%3(A)) (REST %9'%3(A B)) %9'%3C)%1.
A more general technique is
%2pretty-printing%1.
Pretty-printing exploits
additional lines and spaces to highlight the structure in
complex expressions. For example:
.BEGIN CENTER SELECT 3;TABIT3(8,45,52);
(LIST\(FIRST (QUOTE (A)))\(LIST\(FIRST %9'%3(A))
\(REST (QUOTE (A B)))%1#######or%3 \\(REST %9'%3(A B))
\(QUOTE C))%1\\%9'%3C)%1
.END
.PT24
.FP
In a modern LISP implementation
we would find further aids for locating matching parentheses, just as an
interactive Algol-like language would have aids for locating matching
"begin-end" pairs.
.ss(%3concat%1)
Another S-LISP operation for building lists is %3CONCAT%1.
It is a two-address instruction; its first operand can either
be an atom or a list, but its second operand %2must%1 reference a list.
The effect of %3CONCAT%1 is to build a new list
whose first element is the first operand of the %3CONCAT%1 and whose remainder
is the second operand of %3CONCAT%1.
For example %3(CONCAT##%9'%3A##%9'%3(B))%1##
would evaluate to %3(A#B)%1.
Note that %3LIST%1
takes an arbitrary number of arguments
and builds a list whose elements are those arguments.
On the other hand, %3CONCAT%1 takes
only two arguments --an element and a list-- and adds the
element to the front of the list. For example,
.BEGIN tabit2(15,45); SELECT 3;
\(LIST##%9'%3(A)##%9'%3(C)) \%1gives%3 ((A) (C))%1
while\%3(CONCAT##%9'%3(A)##%9'%3(C))\%1gives%3 ((A) C)%1
.END
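.PT24
.FP
Note that any list can be grown from the empty list by repeated %3CONCAT%1s;
each application adds one element to the front:
.BEGIN CENTERIT;group;
←%3(CONCAT %9'%3A (CONCAT %9'%3B %9'%3( ))) %1gives %3(A B)%1
.END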
.PT24
Now we can decompose lists and make new ones. We can perform evaluation of
simple expressions, much like the facilities of a hand calculator.
.SS(Definitions and variables)
The interactive nature of a LISP calculator is greatly enhanced by its
ability to define new operations.
For example, assume we wanted to define a new operation called %3second%1
which extracts the second element of a list:
.BEGIN SELECT 3;CENTER;
second[(A B C)]%1 should give %3B%1.
.END
.FP
We might write in informal M-LISP:
.BEGIN SELECT 3;CENTER;
second[x] <= first[rest[x]].
.END
.PT24;FP
We could read this definition as "define a function named %3second%1 which expects
one operand; we will name that operand, %3x%1. Then
the effect of %3second%1 is to perform
%3first%1-of-the-%3rest%1 of %3x%1."
We might introduce this new instruction to a LISP machine by:
.BEGIN SELECT 3;CENTER;
(DEFINE SECOND (X) (FIRST (REST X))).
.END
.PT24
One new feature is the occurrence
of %3x%1 (or %3X%1 in the representation). Up to this point we have
dealt solely with operations on constants. Now we're introducing
variables; that is %3x%1 is a variable and %3X%1 is its representation.
As long as a LISP machine can find a value associated with a variable,
a computation
can proceed as if we had used constants.
Once the definition has been processed by the machine we
can use that new operation as if it were part of the original calculator:
.BEGIN CENTER;SELECT 3;
(SECOND##%9'%3(A B C))
.END
.fp
The execution of this expression would proceed just as before, only now
the machine must reconcile the form of the input expression with the form of the
definition, in essence reducing %3(SECOND##%9'%3(A#B#C))%1 to
%3(FIRST#(REST##%9'%3(A#B#C)))%1.
Thus %3(SECOND##%9'%3(A#B#C))%1
gives value %3B%1, since the LISP machine associates the list
%3(A#B#C)%1 with %3X%1 --#the representation of the variable %3x%1.
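.PT24
Definitions, once made, may be used in further definitions just like the
built-in operations. For example, we could define an operation %3third%1
in terms of %3second%1:
.BEGIN CENTER;SELECT 3;
(DEFINE THIRD (X) (SECOND (REST X)))
.END
.FP
Then %3(THIRD##%9'%3(A B C))%1 gives %3C%1, since the machine reduces it to
%3second%1 of %3(B C)%1, that is, to %3(FIRST (REST##%9'%3(B C)))%1.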
.SS(Recognizers and predicates)
In traditional assembly language programming we find
instructions which test for zero or compare two numbers.
In LISP we manipulate data objects built from
atoms and lists. The "zero" of lists is the empty list, %3(#)%1; we must be able
to test for an occurrence of %3(#)%1.
Since elements of a list can either be atomic or lists themselves
we must be able to distinguish between these kinds of objects.
Finally, we must be able to distinguish between two non-identical atoms.
All LISP operations compute values. The values which the previous
operations produced were atoms or lists; these new operations
called %2predicates%1 produce "truth values" --true or false.
In M-LISP, we represent true and false as %et%1 and
%ef%1;
however, in S-LISP, these truth values must be represented as
data items and so
we pick the atoms %3T%1 and %3NIL%1 as their representations.
.PT24
.BEGIN INDENT 0,7,5;
%21. %3EQ%1: Compare two atoms. That is, %3EQ%1 is a two-address
instruction which gives value %3T%1 just in the case that
those operands represent the same atom.
.INDENT 0,10,5
%22. %3ATOM%1: This single-address instruction gives %3T%1 if its operand
is an atom, and gives %3NIL%1 otherwise.
.INDENT 0,10,5;
%23. %3NULL%1: This single-address instruction gives %3T%1 just in the
case that its operand is the empty list, %3(#)%1.
.END
.PT24
.BEGIN GROUP;TABIT2(20,56)
For example:\%2S-LISP\M-LISP%3
\(ATOM##%9'%3A) %1gives %3T%3\atom[A] %1gives %et%3
\(ATOM##%9'%3(A)) %1gives %3NIL%3\atom[(A)] %1gives %ef%3
\(EQ##%9'%3A##%9'%3B) %1gives %3NIL%3\eq[A;B] %1gives %ef%3
\(NULL##%9'%3(A B)) %1gives %3NIL%3\null[(A B)] %1gives %ef%3
.END
.pt24
.GROUP;
Since the predicates are value-producing they can be used with the other
list-manipulating operations:
.BEGIN SELECT 3;TABIT2(10,23);group;
\(CONCAT\(ATOM##%9'%3A)
\\(LIST 1##%9'%3A)) %1gives %3(T 1 A)%1
.END
.APART
Notice that the %3atom%1 predicate is of slightly different character than
%3eq%1 and %3null%1. Namely %3atom%1 is performing a "type test" on a
data structure; such predicates are called %2recognizers%1.
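.PT24
.FP
For example, a recognizer may examine a computed component:
.BEGIN CENTERIT;group;
←%3(ATOM (FIRST %9'%3(A B))) %1gives %3T%1
←%3(NULL (REST %9'%3(A))) %1gives %3T%1
.END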
.ss(Conditional expressions)
Clearly, the meaningful use of predicates and recognizers requires
the existence of language constructs to modify the program flow.
Such constructs are called %2control structures%1. One
basic control unit in LISP is called the %2conditional expression%1.
In M-LISP it is written:
.PT24
.FP
%3[%2<p%41%2>%3 → %2<e%41%2>%3;%2 <p%42%2>%3 → %2<e%42%2>%3;%1
... %et%3 → %2<e%4n%2>%3]%1
.PT24
The %2meaning%1 of such a conditional expression is as follows:
.BEGIN INDENT 5,5,5
Each %2<p%4i%2>%1 is a predicate; the %2<e%4i%2>%1's are arbitrary LISP
expressions. We evaluate the
%2<p%4i%2>%1's from left to right, finding the first which gives value
true. The value of the conditional expression is the value of the corresponding
%2<e%4i%2>%1. If none of the
%2<p%4i%2>%1's are true, then the value of the conditional is
%2<e%4n%2>%1.
Notice that this last case is really forced upon us since the last
"predicate" is the constant
%et%1. It is common to read %et%1
used in this
context as "otherwise".
.END
.GROUP;
.fp
We extend our M-to-S mapping to include this new construct, mapping it to:
.BEGIN TABIT1(12);
%3(COND(\(%1<predicate%41%1>%8T%1 <expression%41%1>%8T%3)
\(%1<predicate%42%1>%8T%3 %1<expression%42%1>%8T%3)
\...
\(%3T%3 %1<expression%4n%1>%8T%3))%1
.END
.apart
.FP;PT24
The evaluation of a conditional expression is different from the
technique we have used in previous LISP instructions. Previously we have
insisted that we evaluate all of the operands in an instruction. In the conditional
expression, we evaluate the minimal part of the conditional
which gives us a true predicate; then we evaluate the corresponding expression.
.GROUP;
For example:
.BEGIN GROUP;SELECT 3; nofill
.krk
(COND ((ATOM##%9'%3A)##%9'%3FOO) (T 1)) %1gives value %3FOO%1
since %3(ATOM##%9'%3A)%1 gives %3T%1
%3(COND ((ATOM##%9'%3(A))##%9'%3FOO) (T 1)) %1gives value %31%1
since %3(ATOM##%9'%3(A))%1 gives %3NIL%1
.END
We have introduced all the instruments in the LISP orchestra. Now it's
time to make some music.
.ss(The factorial function)
.group
Our first example is the most venerable of LISP programs:
an algorithm to compute the factorial function:
.BEGIN GROUP;TABIT2(10,15)
\\%31%1 if %3n%1 is %30%1
\%3n!%1 =
\\%3n*(n-1)!%1 if %3n%1 %a> %30%1
.END
.APART;
.FP
We want to convert the informal description
into a complete LISP algorithm.
The "if"-structure can be converted into a conditional expression,
and we can name the new operation %3fact%1. We need a multiplication
operation; we can assume our LISP machine has such an operation, and
we also
assume the existence of a
simple subtraction function, %3sub1[x]#=#x-1%1.
.GROUP;
.fp
Here's a factorial algorithm in M-LISP:
.BEGIN SELECT 3;center;
fact[x] <= [eq[x;0] → 1; %et%* → times[x;fact[sub1[x]]]]
.end
.APART;
.GROUP;
.fp
and here's its pretty-printed translation in S-LISP:
.BEGIN GROUP;SELECT 3;TABIT1(35);
.KRK
.fp
(DEFINE FACT (X) (COND\((EQ X 0) 1)
\(T (TIMES X (FACT (SUB1 X))))))
.END
.APART;
.pt24
The new ingredient in these definitions is the use of %2recursion%1.
A typical recursive definition has several characteristics:
.BEGIN INDENT 0,2,2;
%21.%1 The body of the definition should be a conditional expression.
A definition like %3foo[x]#<=#baz[foo[bar[x]]]%1 will cause nothing
but grief. The conditional expression will contain two basic
parts: the %2termination case%1 and the %2general case(s)%1.
.indent 0,2,2;
%22.%1 The termination case describes what to do when a primitive data structure
is recognized. We
consider the integers built from zero, using the successor
function, %3add1%1. Therefore, our termination case in %3FACT%1 involves
recognition of %30%1, and terminates with value %31%1.
.indent 0,2,2
%23.%1 The general cases involve "composite" data structures.
Since
we assume that the positive integers are built up from zero, we can
decompose a positive (composite) integer down to zero by a sequence
of subtraction-by-one operations.
The essential idea is that reducing the complexity of the argument
in a recursive call will thereby reduce the complexity of the problem.
That's an old trick; what recursion says is that we can solve the
original problem by reducing it to a simpler case of the %2same%1
problem. If we persist, the problem will solve itself before we get tired
of reducing; it's like dieting.
.pt24
.END
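.GROUP;
.FP
The unwinding of a small case shows the mechanism; each recursive call
receives a simpler argument until the termination case is recognized:
.BEGIN SELECT 3;GROUP;NOFILL;
(FACT 2)
 = (TIMES 2 (FACT 1))
 = (TIMES 2 (TIMES 1 (FACT 0)))
 = (TIMES 2 (TIMES 1 1))  %1which gives%3 2
.END
.APART;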
A recursive definition is similar to an inductive description like those
we gave for defining list data structures, or the M-to-S mapping.
The techniques involved in finding the right
inductive steps are similar to those involved in finding the right
decomposition in a recursive definition.
Recursive definition
is a powerful descriptive technique; it can also be implemented
as a very efficient computational mechanism.
.ss(%3equal%1)
For a further example,
assume that we want to test the equality of
two lists, where equality means that each element of two lists is identical
and the order in which those elements occur is identical. The identity relation
also extends to sub-elements of lists. For example:
.BEGIN GROUP;TABIT2(10,45)
.krk
\%2equal\ non-equal%1
\%3(A B C) (A B C)\(A B C) (A B D)
\%3(A (B C) D) (A (B C) D)\(A (B C) D) (A D (B C))
\%3( ) ( )\(A (B (C) D)) (A B C D)
.END
.pt24
.APART;
.fp
Let %3EQUAL%1 be
an algorithm to compute this extended equality. It will
be recursive.
Regardless of the complexity
of objects, all we need to do is find the
right way to decompose them, and then pounce on the pieces.
The decomposition operators we have for lists are %3FIRST%1 and %3REST%1.
We also have to stop the decomposition. In %3FACT%1 we tested
for the occurrence of zero; in %3EQUAL%1 we test for the occurrence of
an empty list, and since we are assuming that elements of a list may either
be sublists or atoms, we need to test for the occurrence of an atom.
Let's try the simplest possible case first: the empty list.
.BEGIN SELECT 3;GROUP;NOFILL
(DEFINE EQUAL (X Y) (COND ((NULL X) ...? )
.END
.pt24
.fp
What should we do? If %3x%1 is empty, then we will only have equality
if %3y%1 is also empty; otherwise we will have an inequality.
.BEGIN SELECT 3;GROUP;TABIT2(39,48);
(DEFINE EQUAL (X Y) (COND (\(NULL X)
\(COND (\(NULL Y) T)
\\(T NIL)))
\...?)
.END
.fp
Note that we embedded a conditional expression
within a conditional expression.
Note also that the interior conditional returns either
%3T%1 or %3NIL%1; but that's what we wanted since %3EQUAL%1 is to encode
a %2predicate%1 and %3T%1 and %3NIL%1 are our representations of the
truth values %et%1 and %ef%1.
Note too that we depend on the order dependence of the conditional evaluation;
we won't test the %3(NULL#Y)%1 expression unless %3(NULL#X)%1 is true.
We won't get to the "%3#...?%1" condition unless %3(NULL#X)%1 is false.
.GROUP;
We can still have
%3x%1 non-empty and %3y%1 empty; let's take care of that:
.BEGIN SELECT 3;GROUP;TABIT2(39,48);
(DEFINE EQUAL (X Y) (COND (\(NULL X)
\(COND (\(NULL Y) T)
\\(T NIL)))
\((NULL Y) NIL)
\...?)
.END
.APART;
%2Now%1 the "%3...?%1" has been reduced to the case that %2both%1
lists are non-empty, and we can massage the pieces with %3FIRST%1
and %3REST%1. We look at the %3FIRST%1 pieces; if they're equal, then
our decision on the equality of the original lists depends on the
equality of the remainders (or %3REST%1s) of the lists. If the
%3FIRST%1s are %2not%1 equal, then we can stop immediately with a
false indication. This analysis yields two cases: 1)#if the first
elements are atomic, then use %3EQ%1 to check their equality; 2)#otherwise
use %3EQUAL%1 itself on the first elements. Here we go:
.BEGIN SELECT 3;GROUP;NOFILL;
.KRK
(DEFINE EQUAL (X Y)
  (COND ((NULL X) (COND ((NULL Y) T) (T NIL)))
        ((NULL Y) NIL)
        ((ATOM (FIRST X))
         (COND ((ATOM (FIRST Y))
                (COND ((EQ (FIRST X)(FIRST Y))
                       (EQUAL (REST X)(REST Y)))
                      (T NIL)))
               (T NIL)))
        ((ATOM (FIRST Y)) NIL)
        ((EQUAL (FIRST X)(FIRST Y)) (EQUAL (REST X)(REST Y)))
        (T NIL)))
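.PT24
.FP
Applying the completed definition to the examples in the table above:
.BEGIN CENTERIT;group;
←%3(EQUAL %9'%3(A (B C) D) %9'%3(A (B C) D)) %1gives %3T%1
←%3(EQUAL %9'%3(A B C) %9'%3(A B D)) %1gives %3NIL%1
.END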
.SS(%3reverse%1)
So far our examples have either been numerical or have been predicates.
Predicates only require traversing existing lists; certainly we will
want to write algorithms which build new lists. Consider the problem
of writing a LISP algorithm to reverse a list %3x%*. There is a simple informal
computation: take elements from the front of %3x%* and put them
onto the front of a new list %3y%*. Initially, %3y%* should be %3(#)%* and
the process should terminate when %3x%* is empty.
.GROUP;
For example, reversal of the list %3(A B C D)%1 would produce the sequence:
.BEGIN SELECT 3; TABIT2(32,49);
\x\y
\(A B C D)\( )
\(B C D)\(A)
\(C D)\(B A)
\(D)\(C B A)
\( )\(D C B A)
.END
.APART
This %3reverse%* function builds up the new list
by %3concat%*-ing
the elements onto the second argument of %3rev%9'%*%1.
.BEGIN CENTERIT;SELECT 3;GROUP;
.P97:
←reverse[x] <= rev%9'%*[x;( )]
←rev%9'%*[x;y] <= [null[x] → y; rev%9'%*[rest[x];concat[first[x];y]]]
.APART
.END
Since %3y%* was initialized
to %3(#)%* we are assured that the resulting construct will be a list.
We leave it to the reader to translate this algorithm into S-LISP.
.GROUP;
As a final example, here's another algorithm for the factorial function.
You might amuse yourselves by proving the equivalence of the two
formulations.
.BEGIN TABIT1(10);SELECT 3;GROUP;CENTERIT;
←fact%41%*[n] <= fact%9'%*[n;1];
←fact%9'%*[n;m] <= [eq[n;0] → m; fact%9'%*[sub1[n];times[n;m]]]
.END
.APART;
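.GROUP;
.FP
As practice with the M-to-S mapping, here is one possible S-LISP rendering of
this second formulation; the name %3FACT2%1 stands in for %3fact%9'%1:
.BEGIN SELECT 3;GROUP;NOFILL;
(DEFINE FACT1 (N) (FACT2 N 1))
(DEFINE FACT2 (N M) (COND ((EQ N 0) M)
                          (T (FACT2 (SUB1 N) (TIMES N M)))))
.END
.APART;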
.ss(Summary)
******car,cdr, cond**** reconciliation****
We will now explore several reasonably detailed examples,
both as an introduction to the authors and to show LISP in a setting
more concrete than the manipulation of abstract list structure.
We need to show how to relate abstract ideas
to concrete reality.
There are no "seven steps to programming perfection"
in LISP;
however, we %2can%1 give insights into programming styles which
can control the complexity of the programming problem.
The trick lies in following a fine line between the abstract and
the concrete. If the algorithm/data structure is over-specified (too concrete)
then the program will be too machine dependent; too unchangeable; and
most important, too difficult to read.
.ss(reality)
***prog envs*****
laubsch vrp, rww
***lisp machine in lisp****
(Operating Systems and Language Systems)
Lisp is a good "systems programming language".
This section will discuss several aspects of LISP which fall roughly
into the operating system area. This will involve discussion of the building
blocks as well as the actual language sub-system components like
editors and debuggers.
Though we will put off a full discussion of editing and debugging
programs until
a later article, we can give a flavor of the "life-support" systems
which LISP users have come to expect.
We have touted the program/data duality as an essential difference
between LISP and other "high-level" languages; we have yet to demonstrate
the usefulness of this property. That we remedy now.
Most of a program's lifetime is spent being a data structure. That means
the program is lying around being operated on by some other active element.
If that is so, then representing programs as data structures which are
easily manipulated should be of primary concern to language designers.
Let's examine the validity of the claim and then discuss some of the
benefits. First, when a program is under construction it is a data structure;
we are doing text construction and modification. When a program is being
compiled it is a data structure; in traditional languages,
the external form of the program is translated into an internal
tree-form, similar to LISP, and that tree structure is traversed to
produce code.
Tony Hoare
once said something to the effect that a good language designer would
rather avoid a problem than solve it. LISP is very good; it
ignores as many problems as possible. Translating from external form
to tree-form --#called parsing#-- is a non-trivial problem for most languages;
indeed, many compiler writers become so enamoured with the problem that they
forget that there's anything more to do. LISP ignores the problem; a LISP
parser (as we shall see) is quite trivial. Debugging compiled code can be
quite painful; relating low-level error symptoms with high-level source
statements requires good karma. LISP ignores that problem too; since
the program is available as a data structure, we need only write another
LISP program to monitor the behavior of the errant LISP program and
let us know when specified anomalies occur. The problem of correcting
bugs once they have been isolated is typically painful; execution must
be aborted and the original text modified. Again, since the LISP
program %2is%1 a data structure, another LISP program called an editor
can correct the damage and, in most LISP systems, the execution can
be reset to an earlier point and continued.
LISP is interactive. You create LISP programs like you drive a car.
Your program is not built and then thrown into a meat-grinder to
succeed or fail. Parts are built interactively and incrementally.
LISP programming style is best characterized as middle-out rather than
top-down or bottom-up.
****editor as subst***
dynamic structure
compilers
editors
debuggers
true interaction(??)
.SS(Symbolic mathematics)
When we approach a contemporary calculator we expect to receive
answers to questions like
"What's %34+(2*2*3)+9%1?"
Some calculators can even answer questions like
"What's
%3x%82%3#+#2xy#+y%82%1, when %3x%1 is %32%1 and %3y%1 is %33%1?"
However, if we were to ask "What's %32x+y+3x%1?"
our calculator would complain, even though a
person might very well respond "%35x+y%1".
Rather than begin with the kinds of algebraic simplification
involved in our "%35x+y%1"-example, we jump in at the deep end and examine
a symbolic computation which is basic to a calculus course: differentiation.
If you'd rather not read about calculus you may safely skip to another section;
the remainder of the material is not dependent on this section.
Differentiation %2is%1 a symbolic operation; it operates on expressions and
produces expressions. Differentiation is also much easier to
specify algorithmically than "simplification"; a "simplification" in one context
is a "complexification" in another. Is
%34+y(4+y)%1 simpler than %34+4y+y%82%1 or %3(2+y)%82%1 or %34(1+y)+y%82%1?
We can find circumstances in which any one of these expressions is the
appropriate "simplification" given that the expression combines with a
larger expression. Differentiation is more clear-cut.
Note that the differentiation algorithm is recursive.
The question of differentiating a sum is reduced
to the ability to differentiate each summand.
In particular,
#d[%3f + g%*]/d%3x%* = d%3f/%*d%3x + %*d%3g%*/d%3x%1, with
similar relationships
holding for products, differences, and powers.
As usual in recursive definitions, there must be some termination
conditions.
Differentiation of a variable, say %3x%*, with respect to %3x%* is defined to be 1;
differentiating a constant, or a variable not equal to %3x%* with
respect to %3x%* gives a result of zero.
More generally, if %3u%* is a constant or a variable then:
.GROUP
.BEGIN TABIT2(20,30);
\d%3u%*/d%3x%* =\1 if %3x = u
\\%10 otherwise
.END
.APART
We want to express the differentiation algorithm for a class of polynomials.
Though polynomials can be arbitrarily complex, involving the operations of addition,
multiplication, negation, and exponentiation, their general format is very simple
if they are described in prefix notation like M-LISP, where the operation
precedes its
operands. We assume that binary plus, times, and exponentiation are
symbolized by +, *, and ↑; we will
write %3+[x;2]%* instead of the usual infix notation %3x+2%*.
We can be more specific about this class of polynomials:
.BEGIN INDENT 0,5;group;
%21.%* constants and variables are terms.
.krk
%22.%* If t%41%* and t%42%* are terms then the "product"
of t%41%* and t%42%* is a term.
.krk
%23.%* If t%41%* is a variable and t%42%* is a constant then
"t%41%* raised to the t%42%*%8th%1 power" is a term.
.krk
%24.%* If t%41%* is a term then "minus" t%41%* is a term.
.apart;
.GROUP;
.krk
and finally:
.krk
%25.%* Any term is a polynomial.
.krk
%26.%* If p%41%* and p%42%* are polynomials then the "sum"
of p%41%* and p%42%* is a polynomial.
.END
.APART
.GROUP;
Here is a BNF description of the above set using
the syntax of prefix notation:
.BEGIN TABIT1(13);GROUP;
<poly>\::= <term> | +[<poly>;<poly>]
<term>\::= <constant> | <variable> | *[<term>;<term>] | ↑[<variable>;<constant>]
\ | -<term>
<constant>\::= <numeral>
<variable>\::= <identifier>
.END
.APART;
As we have seen, it is easy to write recursive
algorithms in LISP; the only problem here is that the domain and
range of LISP functions is lists, not the polynomials.
We need to represent arbitrary polynomials as lists.
We use exactly the scheme we applied to the M-to-S LISP mapping.
.GROUP;
For example:##%3x%82%* + 2yz + u%1
will be translated to the prefix notation:
%3+[↑[x;2];#+[*[2;*[y;z]];#u]] %1.
From this it's easy to get the list form:
%3(PLUS (EXPT X 2) (PLUS (TIMES 2 (TIMES Y Z)) U))
%1
.APART
At this point we can begin to develop the
recursive algorithm.
We will represent the d-operator as a binary LISP function named %3diff%*.
The application, d%3u%*/d%3x%* will be represented as %3diff[u;x]%*.
.BEGIN TABIT2(20,30);select 3
diff[u;x] <= [isvar[u] → [samevar[x;u] → 1; %et%* → 0];
\ ?]
.END
.APART
Notice the use of the abstract predicates %3isvar%1 and %3samevar%1.
Before we run this program we must of course supply these components,
but for expository purposes in developing the algorithm it is much
clearer this way; and for modularity of programming it is much
better to refrain from encoding the representation (i.e. replacing
%3isvar%1 by %3atom%1 and %3samevar%1 by %3eq%1) into the description.
Adding the programming hacks is easy; writing clear abstract algorithms is
difficult.
.GROUP;
We continue, adding the algorithm for + and *.
We would see a sum #%3u#=#f+g%1 represented as
%3(PLUS#f%8T%3#g%8T%3)%1
In terms of our representation,
the result of differentiating such a %3u%* is a new list of three
elements:
.NOFILL
1. the symbol %3PLUS%*.
2. the effect of %3diff%* operating on %3f%8T%1
3. the effect of %3diff%* operating on %3g%8T%1
.fill
.APART
Abstractly, the effect is a new "sum" whose summands are the effect of
%3diff%1 operating on the arguments of the original
sum. Let us assume the existence
of two abstract selectors, %3arg%41%1 and %3arg%42%1.
Then another part of the algorithm is:
.begin tabit2(13,26);select 3;group;
.krk
\ issum[u] → list\[PLUS;
\\ diff [arg%41%*[u]; x];
\\ diff [arg%42%*[u]; x]];
.end
.GROUP;
.FILL
We know that
d[%3f*g]%*/d%3x%* is defined to be %3f%*#d%3g%*/d%3x%* + %3g%*#d%3f%*/d%3x%*;
So here's another part of %3diff%*:
%3
.begin tabit2(16,34);select 3;group
.krk
\ isprod[u] → makesum[\makeprod[\arg%41%*[u];
\\\diff[arg%42%*[u]; x]];
\\makeprod[\arg%42%*[u];
\\\diff[arg%41%*[u]; x]]];
.end
Here is the completed algorithm for sums and products.
.BEGIN TABIT3(13,31,39);select 3;
.GROUP
diff[u;x] <=\[isvar[u] → [samevar[x;u] → 1; %et%* → 0];
\ isconst[u] → 0;
\ issum[u] → makesum[\diff[arg%41%*[u]; x];
\\diff[arg%42%*[u]; x]];
\ isprod[u] → makesum[\makeprod[\arg%41%*[u];
\\\diff[arg%42%*[u]; x]];
\\makeprod[\arg%42%*[u];
\\\diff [arg%41%*[u]; x]]];
\ %et%* → error[]]
.APART
.END
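.FP
Here is the same algorithm rendered in S-LISP by the M-to-S mapping described
earlier; a sketch, assuming the abstract components translate to atoms
(%3ISVAR%1, %3ARG1%1, %3MAKESUM%1, and so forth) in the obvious way:
.BEGIN SELECT 3;GROUP;NOFILL;
(DEFINE DIFF (U X)
  (COND ((ISVAR U) (COND ((SAMEVAR X U) 1) (T 0)))
        ((ISCONST U) 0)
        ((ISSUM U) (MAKESUM (DIFF (ARG1 U) X)
                            (DIFF (ARG2 U) X)))
        ((ISPROD U) (MAKESUM (MAKEPROD (ARG1 U) (DIFF (ARG2 U) X))
                             (MAKEPROD (ARG2 U) (DIFF (ARG1 U) X))))
        (T (ERROR))))
.END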
.select 1
And here are some of the components matching our abstract algorithm to the
chosen representation of the data.
.BEGIN CENTERIT; SELECT 3;group;tabit2(15,45)
issum[x] <= eq[first[x];PLUS]\isprod[x] <= eq[first[x];TIMES]
isconst[x] <= numberp[x]\samevar[x;y] <= eq[x;y]
.END
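.FP
The remaining components can be supplied in the same spirit; a sketch,
consistent with the list representation chosen above:
.BEGIN CENTERIT; SELECT 3;group;tabit2(15,45)
isvar[x] <= [numberp[x] → NIL; %et%* → atom[x]]\makesum[x;y] <= list[PLUS;x;y]
arg%41%*[x] <= first[rest[x]]\makeprod[x;y] <= list[TIMES;x;y]
arg%42%*[x] <= first[rest[rest[x]]]
.END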
.GROUP
.BEGIN TABIT3(20,25,29);
Try:##%3diff [(PLUS (TIMES X Y) X); X]
\=list [PLUS; diff[(TIMES X Y); X]; diff [X;X]]
\=list [PLUS;
\\list [PLUS;
\\\list [TIMES; X; diff [Y;X]];
\\\list [TIMES; Y; diff [X;X]]];
\\diff [X;X]]
\=list [PLUS;
\\list [PLUS;
\\\list [TIMES; X ;0];
\\\list [TIMES; Y;1]];
\\1 ]
.APART
\=(PLUS (PLUS (TIMES X 0) (TIMES Y 1)) 1)
.END
%1
which can be reinterpreted as:%3←x*0 + y*1 + 1 .
%1
Now it is clear that we have the right answer; it is equally clear that
the representation leaves much to be desired. There are obvious
simplifications which we would expect to have done before we would
consider this output acceptable.
This example is a
particularly simple case for algebraic simplification. We can easily
write a LISP program to perform simplifications like those expected here:
like replacing %30*x%1 by %30%*, and %3x*1%* by %3x%*. But the general
problem of writing simplifiers, or indeed of recognizing
what is a simplification, is quite difficult.
A whole branch of computer science has grown up around
symbolic and algebraic manipulation of expressions. One of the
crucial parts of such an endeavor is a sophisticated simplifier.
For more details and examples of the power of such systems see
⊗↑[Hea#68]↑, ⊗↑[MAC#74]↑, or ⊗↑[Mos#74]↑.
A particularly nice micro version of such a symbolic mathematics system
has been built by a group at the University of Hawaii. They have
contributed a detailed paper. See ****
.SS(Tree Searching)
A ubiquitous feature of sophisticated game playing is "a strategy".
In simple gamesmanship, e.g. tic-tac-toe, the strategy may be quite simple,
indeed easily computable.
In more complex games like checkers and chess, the algorithmic approach is in for
tough sledding. The heart of much of this strategy formation is often
a tree structure.  That tree will have nodes representing "possible moves".
In a single-person game, the evaluation of the tree will result in
a "best move"; any move that wins. In a two-person game we must be more careful;
the branching structure will represent %2both%1 your moves and those
of our opponent, and the position evaluation must
take that into account: "Now if I move here, then he'll do that,#...
We will consider both kinds of tree evaluations, presenting
the simpler ones first.
Assume that any node in a tree can have any finite
number of branches, and that the
trees will terminate on all of their branches.
That's enough information for the simple case; let's start naming the parts.
We'll have a tree, %3tr%1; associated with it will be a recognizer
named %3is_term%1, which will return %et%1 if the tree is the trivial
terminal tree with no branches. A terminal tree may either be a %3WIN%1 or a
%3LOSS%1. If it's a win, we know how to achieve our goal; if it's a
%3LOSS%1, then we look further. That "further" says examine the alternatives
of the immediately preceding node; if there aren't any alternatives
then back up to %2its%1 predecessor.
If a tree has branches we will select them with the selector %3branches%1.
.BEGIN GROUP;SELECT 3;TABIT2(18,22);
eval_tree[tr] <= [\is_term[tr] → [is_win[tr] → tr; %et%3 → %3LOSS%1]
\%et%3 → eval_branches[branches[tr]]]
eval_branches[l] <= [\null[l] → LOSS;
\\eq[LOSS;eval_tree[first[l]]] → eval_branches[rest[l]];
\\%et%3 → first[l]]
.END
We used a few undefined LISP functions, but their definitions should be clear
from context. What should be equally clear is the structure of the algorithm.
No reams of code, only two simple recursive definitions.
Translate the algorithms to list notation; supply
the missing selectors and recognizers (no constructors here), and
the programs will run.
Now we consider two-person games;
this class of game includes checkers and chess.
In the following discussions we will make several assumptions.
.BEGIN INDENT 0,2,2;GROUP;
%21.%1 Our opponent is as smart as we are. Obviously false, but
this assumption will be reflected by using our evaluation function
in evaluating the positions of our opponent.
.indent 0,2,2
%22.%1 We
assume that our opponent is also trying to win. Therefore his move will
reflect his attempt to do us maximal harm, trying to mix up our
winning strategy. Since we are using the same position-evaluator,
his "maximal harm" is our "minimal good".
We therefore follow a "max-min" strategy
wherein we attempt to find our best move
which our opponent cannot turn into a disaster for us.
.END
From these ideas we formulate our position evaluation strategy as follows:
.BEGIN INDENT 0,2,2;
%21.%1 Grow a tree of moves. First our possible moves from a position, then
his counter moves; then our responses,#... Continue this until the
branch terminates or until we get tired.
.indent 0,2,2
%22.%1 Once the tree is built, we evaluate the terminal nodes.
.indent 0,2,2
%23.%1 The values are propagated back up the tree using the min-max idea.
If the preceding node is ours, we assign that node the maximum of the
branch values; if the preceding node is his we assign the minimum of the
values. We proceed in this fashion, finally returning the value of the
"best path".
.END
We will simplify matters somewhat, returning only the "value" of the
best path (leaving the player in a state of fury since we won't tell
him how we found that path).
First we develop some subfunctions:
.BEGIN group;select 3;turn off "∞";tabit1(33);
maxlist[l;f] <= [null[l] → -∞; %et%3 → max[\f[first[l]];
\maxlist[rest[l];f]]]
minlist[l;f] <= [null[l] → ∞; %et%3 → min[\f[first[l]];
\minlist[rest[l];f]]]
.END
We should explain a few things.
First, the "%3∞%1" means a giant number, bigger than any other value
our evaluation function %3f%1 can concoct. Second, the use of %3f%1
is new.
It is used as a LISP function within the bodies of the definitions, yet
it is passed in as a parameter; it is used as a %2functional argument%1.
Even though we write abstractly, we
know that the arguments to LISP functions must be lists. Fortunately, LISP
supports a representation of functions as data objects.
So LISP allows the passing of functions as
parameters, and the returning of functions as values.
Here is an example:
.BEGIN CENTER;SELECT 3;
maxlist[(1 3 5 2);add1] = 6 %1and%3 minlist[(1 3 5 2);add1] = 2.
.END
With those preliminaries, here's min-max.
.BEGIN SELECT 3; GROUP;
maxpos[p] <= [is_term[p] → value[p]; %et%3 → maxlist[branches[p]; minpos]]
minpos[p] <= [is_term[p] → value[p]; %et%3 → minlist[branches[p]; maxpos]]
.END
.pt24
.fp
%3maxpos%1 gives the value of a position for the maximizing player
and %3minpos%1 gives the value of a position for the minimizing player.
##%3value%1 is the terminal position evaluation function.
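.FP
To run these definitions we must pick a representation for positions. One
simple sketch: let a terminal position be a number which is its own value, and
a non-terminal position be the list of its branches; then %3is_term%1 is
%3numberp%1, %3value%1 is the identity function, and %3branches%1 returns its
argument unchanged. Under that representation
.BEGIN CENTER;SELECT 3;
maxpos[((1 2) (3 4))] %1gives%3 3
.END
.FP
since the minimizing player can hold us to %31%1 in the first branch and %33%1
in the second, and we take the larger of the two.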
What's even more interesting is that there is a simple technique which
will allow us to discover the optimal path usually without having to
visit all the nodes. The technique, discovered by John McCarthy in 1958,
is called %7a%2-%7b%2#pruning%1; it is based on the observation that
if our opponent is assured that he can force us into an unfavorable position
then he won't make a move which would give us a %2better%1 position.
That's obvious; what is %2not%1 obvious is that he can
often make such decisions
on the basis of only a partial evaluation of the tree.
Consider:
.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";
O
~
⊂αααααα∀ααααα⊃ opponent∞1'∞gs moves
~ ~
N M
⊂αααβααα⊃ ⊂αααβαα ...⊃ our moves
~ ~ ~ ~
7 3 4 ? ...
.END
Since we are to evaluate the position at %gN%1, we maximize the position,
getting %g7%1; that becomes the value of node %gN%1. It is up to our
opponent to evaluate position %gO%1, and he now knows we're going to
get a %g7%1 if he moves to %gN%1. He looks questioningly at "?"; if that
value is greater than %g7%1 then he immediately rejects move %gM%1 without
examining the other possibilities; things can only get worse for him.
If "?" is less than
%g7%1, then he looks at additional alternatives at %gM%1. Once our
opponent is finished evaluating the position, then it's our turn
to play the game at the position above %gO%1, only now we will try to maximize
what that stingy individual has left us.
We let %7a%1 be the value which must be exceeded for a position to be
desirable by the position about to play; and let %7b%1 be the value which must
%2not%1 be exceeded if the move leading to the position would be made by the
opponent; in the above example %g7%1 is the %7b%1-value when evaluating
%gM%1. Let's modify the min-max algorithms to include %7a%1-%7b%1
pruning.
.BEGIN group;select 3;turn off "∞";tabit2(26,38);
maxlist%4αβ%3[l;f;%7a%3;%7b%3] <= [\null[l] → %7a%3;
\f[first[l]] %a≥%1 %7b%3 → %7b%3;
\%et%3 → maxlist%4αβ%3[\rest[l];
\\f;
\\max[%7a%3;f[first[l]]];
\\%7b%3]]
.end
.BEGIN group;select 3;turn off "∞";tabit2(26,38);
minlist%4αβ%3[l;f;%7a%3;%7b%3] <= [\null[l] → %7b%3;
\f[first[l]] %a≤%1 %7a%3 → %7a%3;
\%et%3 → minlist%4αβ%3[\rest[l];
\\f;
\\%7a%3;
\\min[%7b%3;f[first[l]]]]]
.END
.BEGIN group;select 3;turn off "∞";tabit1(25);
maxpos%4αβ%3[p;%7a%3;%7b%3] <= [\is_term[p] → max[%7a%3;min[%7b%3;value[p]]];
\%et%3 → maxlist%4αβ%3[branches[p]; minpos%41%3;%7a%3;%7b%3]]
minpos%41%3[x] <= minpos%4αβ%3[x;%7a%3;%7b%3]
minpos%4αβ%3[p;%7a%3;%7b%3] <= [\is_term[p] → max[%7a%3;min[%7b%3;value[p]]];
\%et%3 → minlist%4αβ%3[branches[p]; maxpos%41%3;%7a%3;%7b%3]]
maxpos%41%3[x] <= maxpos%4αβ%3[x;%7a%3;%7b%3]
.END
The process can be initialized with %7a%1 and %7b%1 set to %3-∞%1 and %3∞%1
respectively. Tighter bounds on "acceptability" can be enforced by
picking different %7a%1's and %7b%1's. The effect will be to shorten the
search time, while perhaps ignoring some neat moves; %3caveat emptor%1.
This is %2not%1 a trivial algorithm. However, its description as a LISP
program is about as simple and as compact as you will find anywhere.
The introduction of functions as full-fledged data objects in programming
languages dates to LISP in 1960; few languages have supported functional
data as well. Their implementation is as subtle as their power in
representing problems. In his paper,
"Functions as Object", Phil Agre discusses both aspects of their
behavior.
.SS(Data Bases)
just poor os's (w.o. procedures in base)
plists, dictionary, and crap for kornfeld
.SS(Simple computers)
Though the execution of LISP programs may seem foreign to
the architecture of a micro computer, in fact a LISP architecture
is really quite conventional. It simply requires an "unconventional"
perspective on the traditional machines.
We begin by looking at the macroscopic structure of a contemporary
machine. Its basic item of information is called a %2word%1.
Each such word contains the same number of bits.
Words are not simply scattered about like loose
paper on a desk top; rather, they are arranged neatly,
like the pages in a book. Each word is assigned a "page number" called
its %2address%1.
We will call a sequential collection of words a %2vector%1.
LISP memory, on the other hand, is organized as a tree with the
"next to" relationship encoded in the data items themselves:
given a list of objects, LISP's "next to" relation is
the composition %3first[rest[x]]%1.
????????
Contemporary machines owe their existence to the
elegant idea
of a "stored program" machines wherein
both instructions and data are stored in the machine.
This added a bit of complexity since, those instructions
had to reference data; therefore some means for designating an address
was included in the instruction format. Additional complexity
occurs because the machine must also have a way of selecting which instructions
to execute. In
some older machines the instructions included an explicit field specifying
which address contained the next instruction. Most modern computers
have replaced that generality with an implicit incrementation to the next
location unless otherwise directed.
This activity is controlled by a special register called
the %2program counter%1 (%2PC%1).
If %2PC%1 references
a simple data handling instruction the machine executes the operation
and then prepares to execute the next instruction in the sequence;
that is done by adding %31%1 to the contents of the %2PC%1.
Control operations --those which modify program flow, like jumps,
conditional branches, or subroutine calls-- are performed by changing the
contents of the %2PC%1.
The basic cycle of a contemporary machine is something like:
.GROUP;
.BEGIN TABIT2(27,30);GROUP;TURN OFF "←";
\%5l:%3\C(%2IR%3) ← C(C(%2PC%3))
\\C(%2PC%3) ← C(%2PC%3) + 1
\\ %5execute %3C(%2IR%3)
\\%5go to l
.END
.APART
%3C(x)%1 is read "contents of %3x%1"; the arrow "←" is read "replaced by".
The %2IR%*, or %2Instruction register%1, is an internal register
used to hold the instruction we are to execute.
There is a similar execution algorithm for a LISP machine, except
that the "program counter" follows the tree-structure of the program.
The duality of program and data which LISP enjoys is also present on
contemporary hardware.
Most machines assume that
any word can contain an instruction or data,
and
the interpretation of a word
depends on the way the machine accesses the word.
The contents of a location can even be used both as an instruction
and as data; if location %3101%1 contained the representation of the
instruction %3ADD#101%1, then the execution cycle for that instruction would
involve location %3101%1 as both instruction %2and%1 data.
This dual role of instruction and data occurs in less ad hoc situations.
The typical loader receives the "data" form of instructions
from the assembler,
acts on them as data items, and loads them into memory.
It does not execute them. This duality of program and data structure is
an important one.
In terms of a contemporary machine this means:
.BEGIN GROUP;INDENT 5,5,5;
.pt24
If a location is referenced by the %2PC%1 then its contents is decoded
as an instruction. If a location is referenced as an operand then its
contents is taken as data.
.pt24
.END
Translated to LISP terms it will become:
.begin indent 5,5,5;group;
We have a processing unit named %3eval%1 which accesses LISP memory.
If a memory element (list) is accessed in the instruction fetch,
then that element is decoded as a LISP instruction; otherwise
the item is accepted as data.
.end
.pt24
We know, from
the nature of LISP operations, that this processing operation can become quite
complex, even recursive; but the basic machine cycle is quite traditional.
We will address the implementation details (micro code) which are
necessary to support such evaluation schemes. However, more generally,
as a recursive process the evaluation of expressions must
be subject to termination conditions. We have seen those termination
conditions: the recursive evaluation of an expression terminates
when its value is a constant: "the value of 1 is 1"; "the value of
%3(A#B#C)%1 is %3(A#B#C)%1". However, what's %2inside%1 the LISP machine
is the %2representation%1 of %3(A#B#C)%1, namely %3(QUOTE#(A#B#C))%1,
so our processor also needs to watch for %3QUOTE%1 to stop its operand fetch.
But this is nothing more than a slight generalization of the
indirect address mechanism of most contemporary machines!
If an address is
fetched indirectly, it means that the contents of the address
are not used as data, but are interpreted as a further address
and the contents of that further address are examined. If the machine
allows arbitrarily deep indirect addressing, that further address
is examined to see if %3it%1 specifies additional indirection.
This chain of indirect addressing must terminate with a "real" address
if the instruction which instigated this search is ever to complete
its execution.
So conceptually, a LISP machine is a very conservative extension of
traditional Von#Neumann architecture. Its data is more general;
its processing unit is more general; but its stored program concept,
program/data duality, and instruction fetch are only slightly
different. That slight difference in conception makes a world
of difference in power.
Of course this "conceptual LISP machine" is as weak as a "conceptual
arithmentic machine" with plus and times, say. LISP constructors, like
%3concat%1 (%3cons%1) and %3list%1, bring new objects into existence.
Those objects must be manufactured; there is a non-trivial problem
in LISP storage management. More mundane problems like input and output
must also be solved, and if this is to be a "production version of LISP"
much care needs be taken in supplying a full complement of data types
along with their storage management techniques. These problems
can be solved with software. ⊗⊗⊗⊗greussay⊗⊗⊗ prini⊗⊗⊗⊗
An exciting alternative is to build LISP machines in hardware, thereby
"raising the programming floor" to a much more acceptable machine
level than has been previously available. ⊗⊗⊗⊗greenblatt⊗⊗⊗⊗
eval and friends
prog as extension
assignment and rplaca-d
lambda's
.SS(Input/Output: %3read%* and %3print%*,,P13:)
*****sdio
attribute grams
recursive descent
.TURN ON "←";
.BEGIN "FOO";TURN OFF "α";
Any implementation of LISP is simplified dramatically since
a very large part of LISP can be written in LISP itself.
We have already seen that the
evaluation process is expressible this way, and we will exploit this
property again later in dealing with compilers.
Here we will show that many of
the LISP input and output routines can be
written as LISP functions calling a very few primitive routines.
The primitive routines can also be described in LISP,
though they would normally be coded in machine language.
The primitive functions are ⊗→%3ratom%*↔← and ⊗→%3patom%*↔←.
.BEGIN INDENT 0,7;
.GROUP
%3ratom[ ]%* is a function of no arguments. It reads the input string, constructing
the next atom
or special character (left parenthesis or right parenthesis).
It looks up that object in the atom table and returns
a pointer to that table entry. If no entry
is found an appropriate entry is made.
%3ratom%* flushes spaces and commas,
recognizing them as delimiters. It returns only atoms or special characters
to %3read%*.
.APART
.INDENT 0,8;
.GROUP
%3patom[x]%* is a function of one argument expecting an atom, left paren,
right paren, blank, or dot as the value of its argument.
It will display the represented object on the output
device.
.APART
.END
%3ratom%* is the LISP %2⊗→scanner↔←%*.
A scanner must negotiate with the actual input device for input
characters. The scanner builds the most basic ingredients, like identifiers
or numbers, and only after
such a basic block has been recognized is the next level of syntax
analysis attempted.
The units, also called tokens, which the scanner has built are passed to the
%2⊗→parser↔←%*.
A parser encodes the BNF equations
of the class of well-formed input expressions and determines whether or not
the input stream is a well-formed expression.
The parser builds a tree-representation of the input string;
the structure of the tree reflects the language constructs which were present in the
input.
%3read%* is the LISP ⊗→parser↔←;
it will recognize both S-expression and list-notation.
Besides analyzing the input stream, %3read%*
also builds an S-expression representation of the input.
To simplify matters, we need to refer to atoms which will print
the characters "%3)%*", "%3(%*", and " "#(blank).
We will assume that %3RPAR%*, %3LPAR%*,
and %3BLANK%* denote such atoms. For example, if the next input
character is "(" then
←%3eq[ratom[ ];LPAR]%* is true (and the input pointer is moved to
the next character!).
%3patom[RPAR]%* will have the effect of printing a %2")"%* on the output
device.
.BEGIN TABIT2(17,21);TURN OFF"←";SELECT 3;
.BEGIN FILL;select 1;
The LISP parser, %3read%*, will return a representation of an atom or list, %9α%1.
The call on
%3err%1 will terminate the input scanning
immediately.
%3read_rest%* will translate strings %9α%* acceptable in the context "%3(%9α%1".
Thus %9α%* being %3"A)"%* or %3"A#(B#C))"%* would be suitable for %3read_rest%*,
since %3(A)%* and %3(A#(B#C))%* are lists.
.END
.GROUP
.BEGIN TABIT2(11,15);TURN OFF"←";SELECT 3;
%3read%* <= λ[[ ] read%41%3[ratom[ ]]]
read%41%3[j] <=\[atom[j] → j;
\\ is_lpar[j] → read_rest[ ];
\\ %et%3 → err[ ]]
.END
.APART
.GROUP;
.BEGIN FILL;SELECT 1;
%3read_rest%* is looking for legal %9α%*'s in the context "%3(%9α%1".
If it sees:
.END
.BEGIN NOFILL;SELECT 1;
\an atom, then %9α%1 is <atom>%9β%3)%1;
\a left parenthesis, then %9α%1 is %3(%9β%3)%9∂%3)%1;
\a right parenthesis, then %9α%1 is %3)%1.
.END
.APART
.GROUP
.BEGIN TABIT2(17,21);TURN OFF"←";SELECT 3;
⊗→%3read_rest%*↔← <= λ[[ ] read_rest%41%3[ratom[ ]]]
read_rest%41%3[j] <=\[atom[j] → concat[j;read_rest[ ]];
\\ is_lpar[j] → concat[read_rest[ ];read_rest[ ]];
\\ is_rpar[j] → NIL;
\\ %et%3 → err[ ]]
.END
.APART
.END
.GROUP;select 1;
Finally, here are some of the primitive recognizers which these functions use.
.BEGIN CENTER;SELECT 3;
is_dot[x] <= eq[x;PERIOD] is_lpar[x] <= eq[x;LPAR], ...%1 etc.
.END
.APART
*****how simple can you get
*****quote pratt ****
****do bnf
****recursive descent-decent-dissent
Here are %3print%* and friends. %3terpri%* initiates a new output line.
.BEGIN TABIT3(10,13,25);TURN OFF "←";SELECT 3;
.GROUP
.BEGIN CENTER;select 3;
⊗→%3print%*↔← <= λ[[x] prin0[x]; terpri[ ]; x].
.END
%1The value of %3print[x]%* is %3x%*.
.APART
.BEGIN GROUP;SELECT 3;
⊗→%3prin0%*↔←%3 <= λ[[x]prog[[j]
\\[atom[x] → return[patom[x]]];
\\j ← x;
\\patom[LPAR];
\a3\prin0[car[j]];
\\[null[cdr[j]] →\patom[RPAR];
\\\return[x]];
\\patom[BLANK];
\\j ← cdr[j];
\\go[a3] ]]
.END
.BEGIN FILL;SELECT 1;
Notice that we have used the extended
λ-expressions and conditional expression as described
in {yonss(P253)}⊗↓Notice too that %3print[(A#.(B#.#C))]%* prints as
%3(A#B#.#C)%1. This is because %3print%1 doesn't know that the structure is
not a list until it sees the last dotted-pair. There are two ways of handling this:
either require a type-code, telling whether the structure is a dotted pair or
a list, represented as a dotted pair. Then %2all%1 dotted pairs are printed in dot
notation, and %2all%1 lists are printed in list notation. The other alternative
is to first examine the structure; if it is a list representation, then print it
that way, otherwise print it as a dotted pair. This problem is another
indication of "object vs. representation".←.
.END
.APART
.END
The basic %3print%1 routine allows us to print data structures and program
representations. However the printer will
print duplications for a list structure which has shared
branches and, worse yet, will not terminate if it is given a
circular structure. Some implementations of LISP remedy this ailment;
see ⊗↑[Ter#75]↑ or ⊗↑[Int#75]↑.
.GROUP;
The output format is a simple linear string
of atoms, numbers, spaces, and parens.
For example a %3print%1-based
program for printing function definitions might print the following as the
definition of %3member%1:
.BEGIN FILL;SELECT 3;INDENT 20,20,20;
.P245:
(MEMBER (LAMBDA (X L) (COND ((NULL L) NIL) ((EQ X (FIRST L)) T) (T (MEMBER X (REST L))))))
.END
.APART
The print routine can break the text at the end of any atom⊗↓Some implementations
even allow the printer to break in the middle of an atom. This is accomplished
by designating a special character for carriage control, and the %3read%1 routine
knows to ignore the immediately following end-of-line sequence.←; the only
restriction we place on printing of expressions is that what is %3print%1-ed
must be %3read%1-able.
Even with a small definition like this, we have difficulty deciphering
the structure. When functions or lists become large and deeply nested
the readability becomes unmanageable. Most implementations of LISP supply
formatting programs called "pretty-printers" or "grinders" to supplement
the basic print routine.
.GROUP;
A pretty-printer might print %3member%1 as:
.BEGIN SELECT 3;TABIT1(10);
(MEMBER
(LAMBDA (X L)
(COND\((NULL L) NIL)
\((EQ X (FIRST L)) T)
\(T (MEMBER X (REST L))))))
.END
.APART
So far we have thrown all the I/O back on %3ratom%* and %3patom%*. Clearly
%3ratom%* will be more interesting. All %3patom%* need do is get the p-name and
print it. %3ratom%* needs to do an efficient search of the atom table and if
the atom is not found, add it to the table. All %3ratom%* has to work
with is the actual character string which
will be the p-name of some atom. What %3ratom%* could do is look at the
p-name of each atom currently in the table of atoms; when it finds a match
it returns a pointer to that atom.
This is essentially the search scheme of %3assoc%*.
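.FP
For reference, a sketch of that search written as a LISP function, where the
table is assumed to be a list of (name value) pairs:
.BEGIN SELECT 3;GROUP;NOFILL;
assoc[x;l] <= [null[l] → NIL;
               eq[first[first[l]];x] → first[l];
               %et%* → assoc[x;rest[l]]]
.END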
If the appropriate
atom is not found it can build a new one consisting of the p-name, add it
to the table, and return a pointer to this new entry.
In the next section we will introduce an alternative scheme called hashing.
*****set up for prini and implementation****
IV
implementation of the static structure
for ... machine
gc
hash
read
print